N=1:

rank | frequency | n-gram |
---|---|---|
1 | 1986 | -s |
2 | 1690 | -o |
3 | 1201 | -a |
4 | 803 | -e |
5 | 550 | -r |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 713 | -os |
2 | 565 | -as |
3 | 474 | -es |
4 | 450 | -ão |
5 | 401 | -do |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 321 | -ção |
2 | 237 | -nte |
3 | 226 | -dos |
4 | 175 | -ado |
5 | 162 | -das |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 212 | -ação |
2 | 195 | -ente |
3 | 154 | -ados |
4 | 116 | -ções |
5 | 109 | -ento |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 124 | -mente |
2 | 103 | -mento |
3 | 87 | -idade |
4 | 68 | -ações |
5 | 57 | -ência |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels that of 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:

    SELECT @pos := (@pos + 1), xx.*
    FROM (SELECT @pos := 0) r,
         (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word, 3))
          FROM words
          WHERE w_id > 100
          GROUP BY RIGHT(word, 3)
          ORDER BY cnt DESC) xx
    LIMIT 5;
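The same counting can be sketched outside the database. The following is a minimal Python illustration, not the code used to produce the tables: it assumes the word list has already been loaded into memory as one entry per word and already filtered (the `w_id>100` condition of the query is not reproduced). The function name `top_endings` and the sample words are hypothetical. Unlike `RIGHT(word, N)` in MySQL, which returns the whole word when it is shorter than N, this sketch skips such words.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams as
    (rank, frequency, ending) rows."""
    # Assumption: words shorter than n are skipped, which differs from
    # MySQL's RIGHT(word, n) returning the whole word in that case.
    counts = Counter(w[-n:] for w in words if len(w) >= n)
    # Sort by descending frequency; ties are broken alphabetically so
    # that the ranking is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(ranked[:k], start=1)
    ]


# Hypothetical toy word list, for illustration only.
sample_words = [
    "nação", "ação", "canção", "casas",
    "mesas", "gatos", "presidente", "gente",
]

for n in range(1, 6):
    print(f"N={n}:", top_endings(sample_words, n, k=3))
```

Looping over `n` yields one ranking per N, which corresponds to changing the second argument of `RIGHT` in the query.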
2.2.5 Most frequent word beginnings